# Load necessary packages
using Pkg
Pkg.activate(pwd())
using Dates, CSV, DataFrames, HTTP, Statistics
Global Temperature Anomalies
Data compilation
This notebook downloads and compiles the data used in the article “Breaching 1.5°C: Give me the odds” by Vera-Valdés and Kvist (2024). It contains the code used to download the data from the HadCRUT5, GISTEMP, NOAAGlobalTemp, Berkeley Earth, and ONI datasets. For each dataset, the preindustrial level is computed making it easy to compare the temperature anomalies across datasets. The data is downloaded in CSV format and directly accessible from the notebook. At the end of the notebook, all data from the different sources is stored in a single CSV file. The notebook also includes code to plot the data and highlight El Niño and La Niña events.
Global Temperature Anomalies, HadCRUT5, GISTEMP, NOAAGlobalTemp, Berkeley Earth, ONI, El Niño, La Niña
Introduction
This notebook downloads the data used in the article “Breaching 1.5°C: Give me the odds” by Vera-Valdés and Kvist (2024). It contains the code used to download the data from the HadCRUT5, GISTEMP, NOAAGlobalTemp, Berkeley Earth, and ONI datasets.
The notebook shows the code so that you can easily use whichever dataset you want. The data is downloaded in CSV format and directly accessible from the notebook. The data files are also stored in the data folder. Each temperature CSV file contains three columns: Date, RawTemperature (named RawTemp in the GISTEMP and NOAAGlobalTemp files), and Temp. The Date column contains the month of the observation, stored as the first day of that month. The RawTemperature column contains the raw temperature anomaly according to the dataset, and the Temp column contains the temperature anomaly relative to the preindustrial level. The preindustrial level is defined as the average temperature anomaly from 1850 to 1900; in the code, this window covers all months from January 1850 up to, but not including, January 1900. The Temp column is calculated by subtracting the preindustrial level from the RawTemperature column. The ONI dataset contains two columns: Date and Anom, where Anom is the ONI anomaly.
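As a minimal sketch of the re-baselining step described above, the snippet below subtracts the mean over a reference window from a short series. The helper name `rebaseline` and the toy values are assumptions made for illustration; they are not part of the notebook. The window is half-open, matching the date filters used in the notebook's code.

```julia
using Dates, Statistics

# Hypothetical helper (not part of the notebook): subtract the mean anomaly
# over a reference window from every observation.
function rebaseline(dates, raw; from=Date(1850, 1, 1), until=Date(1900, 1, 1))
    inwindow = (dates .>= from) .& (dates .< until)   # half-open window [from, until)
    base = mean(raw[inwindow])
    return raw .- base, base
end

# Toy data, invented for illustration: three months inside the window, one outside.
dates = [Date(1850, 1, 1), Date(1850, 2, 1), Date(1850, 3, 1), Date(2000, 1, 1)]
raw   = [-0.6, -0.3, -0.3, 0.5]

temp, base = rebaseline(dates, raw)
println("baseline: ", base)
println("re-based series: ", temp)
```

Returning the baseline alongside the shifted series makes it easy to report which offset was applied to each dataset.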
The code is written in Julia and is organized into sections that correspond to the different datasets. Each section downloads the data, processes it, and saves it in a CSV file. The data is then merged into a single dataset that contains the temperature anomalies for each dataset, as well as the ONI data. Finally, the notebook includes code to plot the data and highlight El Niño and La Niña events.
Load Packages and Functions
We have to load the necessary packages before running the code. We will use the Dates, CSV, DataFrames, HTTP, and Statistics packages to download and process the data. Dates and Statistics ship with Julia as standard libraries, so they are already installed. The other packages are not part of the standard library, so you need to install them if you haven’t done so already. You can install them using the Pkg package as follows:
# Install necessary packages
# This code installs the packages if they are not already installed.
using Pkg
Pkg.add(["Plots", "Dates", "CSV", "DataFrames", "HTTP"])
Once the packages are installed, we can load them using the using keyword. The Plots package is used for plotting the data, so it is not strictly necessary to load it if you are only downloading and processing the data.
By default, GISTEMP data is in a wide format, with a column for each month. We will convert it to a long format, where each row corresponds to a single month using the longseries function.
# Function to convert wide format data to long format
function longseries(data)
    height = size(data, 1) # Number of rows, equivalent to the number of years
    last_row = 12 - count(ismissing, data[end, 2:13]) # Number of non-missing months in the last year
    many = (height - 1) * 12 + last_row # Total number of months in the long format
    long = zeros(many, 1) # Long format array
    for ii = 1:(height-1) # Loop through all years except the last one
        for jj = 1:12 # Loop through all months
            long[(ii-1)*12+jj] = data[ii, jj+1]
        end
    end
    for jj = 1:last_row # Loop through the last year
        long[(height-1)*12+jj] = data[height, jj+1]
    end
    return long
end
longseries (generic function with 1 method)
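The wide-to-long conversion can also be illustrated on a plain matrix. The sketch below is an alternative formulation, not the notebook's function: it assumes column 1 holds the year, columns 2 to 13 hold January to December, and that missing values occur only at the end of the final year. The toy values are invented.

```julia
# Toy wide table, invented for illustration: column 1 is the year, columns 2-13 are Jan..Dec.
# The last year is incomplete, so its trailing months are missing.
wide = [1880 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12;
        1881 1.01 1.02 1.03 missing missing missing missing missing missing missing missing missing]

months = permutedims(wide[:, 2:13])   # 12 × years, so vec() reads Jan..Dec year by year
long = vec(months)
long = Float64.(long[1:findlast(!ismissing, long)])   # drop the trailing missing months

println(length(long))
println(long[12:13])   # last month of the first year, first month of the second
```

Like `longseries`, this sketch would fail on a missing value in the middle of the series, since such a value cannot be converted to `Float64`.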
HadCRUT5
The HadCRUT5 dataset is a global monthly average temperature dataset compiled by the Met Office Hadley Centre and the Climatic Research Unit at the University of East Anglia (Morice et al. 2021). It is one of the most widely used datasets for global temperature anomalies. The HadCRUT5 dataset is available in CSV format from the Met Office website. The code below downloads the data, processes it, and saves it in a CSV file. The data is then used to calculate the temperature anomalies relative to the 1850-1900 baseline.
# Download the HadCRUT temperature data
# URL of the HadCRUT5 global monthly average CSV
hurl = "https://www.metoffice.gov.uk/hadobs/hadcrut5/data/HadCRUT.5.0.2.0/analysis/diagnostics/HadCRUT.5.0.2.0.analysis.summary_series.global.monthly.csv"
# Local filename to save
hfilename = "data/HadCRUT5_global_monthly_average.csv"
open(hfilename, "w") do io
write(io, HTTP.get(hurl).body)
end
rawhadcrut = CSV.read(hfilename, DataFrame)
rename!(rawhadcrut, :Time => :Date)
rename!(rawhadcrut, :"Anomaly (deg C)" => :RawTemperature)
hadcrut = rawhadcrut[!, [:Date, :RawTemperature]]
oldbase = mean(hadcrut[(hadcrut.Date.>=Date(1850, 1, 1)).&(hadcrut.Date.<Date(1900, 1, 1)), :RawTemperature])
hadcrut[!, :Temp] = hadcrut[!, :RawTemperature] .- oldbase;
CSV.write(hfilename, hadcrut)
first(hadcrut, 5) # Show the first 5 rows of the HadCRUT5 data
| Row | Date | RawTemperature | Temp |
|---|---|---|---|
| | Date | Float64 | Float64 |
| 1 | 1850-01-01 | -0.674564 | -0.31563 |
| 2 | 1850-02-01 | -0.333416 | 0.0255188 |
| 3 | 1850-03-01 | -0.591323 | -0.232388 |
| 4 | 1850-04-01 | -0.588721 | -0.229786 |
| 5 | 1850-05-01 | -0.508817 | -0.149882 |
The HadCRUT5 dataset is now saved as HadCRUT5_global_monthly_average.csv in the data folder. The data contains the date, raw temperature anomalies, and the temperature anomalies relative to the 1850-1900 baseline.
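The download step above writes into the data folder and therefore relies on that folder already existing. A minimal sketch of a more defensive write is shown below; the helper `save_bytes` is hypothetical, and the example uses made-up bytes in a temporary folder rather than a real download.

```julia
# Hypothetical helper (not part of the notebook): write raw bytes to `path`,
# creating the parent folder first if it is missing.
function save_bytes(path, bytes)
    mkpath(dirname(path))          # does nothing when the folder already exists
    open(path, "w") do io
        write(io, bytes)
    end
    return path
end

# Illustration with a temporary folder and made-up content instead of a real download.
demo_path = joinpath(mktempdir(), "data", "demo.csv")
save_bytes(demo_path, Vector{UInt8}("Time,Anomaly (deg C)\n1850-01,-0.67\n"))
println(isfile(demo_path))
```

In the notebook's setting, the second argument would be the response body, for example `HTTP.get(hurl).body`.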
GISTEMP
The GISTEMP dataset is a global monthly average temperature dataset (GISTEMP 2020). It is available in CSV format from the NASA GISS website. The code below downloads the data, processes it, and saves it in a CSV file. The data is then used to calculate the temperature anomalies relative to the 1850-1900 baseline.
Note that the GISTEMP data is in a wide format, with a column for each month. We will convert it to a long format, where each row corresponds to a single month using the longseries function defined above.
Moreover, the GISTEMP data starts in 1880 (see below), so the 1850-1900 baseline cannot be computed from the data directly. Following the data source's recommendation, we first re-reference the anomalies from the 1951-1980 base period to 1880-1899, and then add a small adjustment of 0.038°C to account for the pre-1880 data.
# Download the GISTEMP temperature data
# URL of the GISTEMP global monthly average CSV
gurl = "https://data.giss.nasa.gov/gistemp/tabledata_v4/GLB.Ts%2BdSST.csv"
# Local filename to save
gfilename = "data/GISTEMP_global_monthly_average.csv"
# Download the file
open(gfilename, "w") do io
write(io, HTTP.get(gurl).body)
end
longgistemp = CSV.read(gfilename, DataFrame, header=2, missingstring=["***"])
gistemp = longseries(longgistemp)[:]
Tt = length(gistemp) - 1
start = Date(1880, 1, 1) # Start date of the dataset
fin = start + Month(Tt) # End date of the dataset
fechas = collect(start:Month(1):fin) # Create a Date array
gistemp = DataFrame(:Date=>fechas, :RawTemp=>gistemp)
oldbase = mean(gistemp[(gistemp.Date.>=Date(1880, 1, 1)).&(gistemp.Date.<Date(1900, 1, 1)), :RawTemp])
gistemp[!, :Temp] = gistemp[!, :RawTemp] .- oldbase .+ 0.038 # Adjust for pre-1880 data
CSV.write(gfilename, gistemp)
first(gistemp, 5) # Show the first 5 rows of the GISTEMP data
| Row | Date | RawTemp | Temp |
|---|---|---|---|
| | Date | Float64 | Float64 |
| 1 | 1880-01-01 | -0.19 | 0.0734167 |
| 2 | 1880-02-01 | -0.25 | 0.0134167 |
| 3 | 1880-03-01 | -0.09 | 0.173417 |
| 4 | 1880-04-01 | -0.16 | 0.103417 |
| 5 | 1880-05-01 | -0.1 | 0.163417 |
The GISTEMP dataset is now saved as GISTEMP_global_monthly_average.csv in the data folder. The data contains the date, raw temperature anomalies, and the temperature anomalies relative to the 1850-1900 baseline.
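As a small worked check of the GISTEMP adjustment, the sketch below backs out the 1880-1899 mean implied by the first row of the output table above and confirms that it reproduces the second row. The numbers are copied from that table and from the 0.038°C offset in the text; the variable names are illustrative.

```julia
# Values copied from the first two rows of the GISTEMP output table in this notebook.
raw1, temp1 = -0.19, 0.0734167
raw2, temp2 = -0.25, 0.0134167
offset = 0.038                      # pre-1880 adjustment stated in the text

# Temp = RawTemp - base + offset, so base = RawTemp + offset - Temp.
implied_base = raw1 + offset - temp1
println(round(implied_base, digits=4))

# Consistency check: the same base should reproduce the second row.
println(isapprox(raw2 - implied_base + offset, temp2; atol=1e-6))
```

The implied base is the mean of the raw anomalies over 1880-1899, which is negative because those years were cooler than the 1951-1980 base period.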
NOAAGlobalTemp
The NOAAGlobalTemp dataset is a global monthly average temperature dataset compiled by the National Oceanic and Atmospheric Administration (NOAA) (Huang et al. 2024). It is available as a whitespace-delimited text file from the NOAA NCEI website. The code below downloads the data, converts it to CSV, processes it, and saves it in a CSV file. The data is then used to calculate the temperature anomalies relative to the 1850-1900 baseline.
# Download the NOAAGlobalTemp temperature data
# URL of the NOAAGlobalTemp global monthly average CSV
nurl = "https://www.ncei.noaa.gov/data/noaa-global-surface-temperature/v6/access/timeseries/aravg.mon.land_ocean.90S.90N.v6.0.0.202508.asc"
# Local filename to save
nfilename = "data/NOAA_global_monthly_average.csv"
# Download the file
open(nfilename, "w") do io
write(io, HTTP.get(nurl).body)
end
lines = readlines(nfilename)
cleaned_lines = [join(split(strip(line)), ",") for line in lines]
# Write to file
write(nfilename, join(cleaned_lines, "\n"))
rawnoaa = CSV.read(nfilename, DataFrame; delim=',', header=0)
fechas = Date.(rawnoaa.Column1, rawnoaa.Column2, 1) # Create Date column from Column1 and Column2
noaa = DataFrame(:Date=>fechas, :RawTemp=>rawnoaa.Column3)
oldbase = mean(noaa[(noaa.Date.>=Date(1850, 1, 1)).&(noaa.Date.<Date(1900, 1, 1)), :RawTemp])
noaa[!, :Temp] = noaa[!, :RawTemp] .- oldbase
CSV.write(nfilename, noaa)
first(noaa, 5) # Show the first 5 rows of the NOAAGlobalTemp data
| Row | Date | RawTemp | Temp |
|---|---|---|---|
| | Date | Float64 | Float64 |
| 1 | 1850-01-01 | -0.751369 | -0.279708 |
| 2 | 1850-02-01 | -0.527868 | -0.0562065 |
| 3 | 1850-03-01 | -0.542508 | -0.0708465 |
| 4 | 1850-04-01 | -0.655912 | -0.184251 |
| 5 | 1850-05-01 | -0.586003 | -0.114342 |
The NOAAGlobalTemp dataset is now saved as NOAA_global_monthly_average.csv in the data folder. The data contains the date, raw temperature anomalies, and the temperature anomalies relative to the 1850-1900 baseline.
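The whitespace-to-comma conversion used in the code above can be seen in isolation in the following sketch. The sample lines are made up and only mimic the general shape of a padded, space-separated text file.

```julia
# Made-up sample lines: fields separated by a variable number of spaces,
# with leading padding.
sample = ["  1850   1   -0.75   0.12",
          "  1850   2   -0.53   0.11"]

# strip removes the padding, split() without a delimiter splits on runs of
# whitespace, and join glues the fields back together with commas.
cleaned = [join(split(strip(line)), ",") for line in sample]
foreach(println, cleaned)
```

Because `split` without a delimiter collapses repeated whitespace, the result has exactly one comma between fields regardless of how the columns were padded.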
Berkeley Earth
The Berkeley Earth dataset is a global monthly average temperature dataset (Rohde and Hausfather 2020). It is available as a space-delimited text file from the Berkeley Earth website. The code below downloads the data, processes it, and saves it in a CSV file. The data is then used to calculate the temperature anomalies relative to the 1850-1900 baseline.
# Download the Berkeley Earth temperature data
# URL of the Berkeley Earth global monthly average CSV
burl = "https://storage.googleapis.com/berkeley-earth-temperature-hr/global/Global_TAVG_monthly.txt"
# Local filename to save
bfilename = "data/BerkeleyEarth_global_monthly_average.csv"
# Download the file
open(bfilename, "w") do io
write(io, HTTP.get(burl).body)
end
rawtemp = CSV.read(bfilename, DataFrame, comment="%", delim=" ", ignorerepeated=true)
colnames = [:Year, :Month, :Anomaly_Monthly, :Unc_Monthly,
:Anomaly_Annual, :Unc_Annual, :Anomaly_5yr, :Unc_5yr,
:Anomaly_10yr, :Unc_10yr, :Anomaly_20yr, :Unc_20yr]
rename!(rawtemp, colnames)
rawtemp.Date = Date.(rawtemp.Year, rawtemp.Month, 1) # Create Date column from Year and Month
rename!(rawtemp, :Anomaly_Monthly => :RawTemperature)
berkeley = rawtemp[!, [:Date, :RawTemperature]]
oldbase = mean(rawtemp[(rawtemp.Date.>=Date(1850, 1, 1)).&(rawtemp.Date.<Date(1900, 1, 1)), :RawTemperature])
rawtemp[!, :Temp] = rawtemp[!, :RawTemperature] .- oldbase
berkeley.Temp = rawtemp.Temp
CSV.write(bfilename, berkeley)
first(berkeley, 5) # Show the first 5 rows of the Berkeley Earth data
| Row | Date | RawTemperature | Temp |
|---|---|---|---|
| | Date | Float64 | Float64 |
| 1 | 1850-01-01 | -0.473 | -0.181233 |
| 2 | 1850-02-01 | -0.681 | -0.389233 |
| 3 | 1850-03-01 | -0.427 | -0.135233 |
| 4 | 1850-04-01 | -0.681 | -0.389233 |
| 5 | 1850-05-01 | -0.39 | -0.0982333 |
The Berkeley Earth dataset is now saved as BerkeleyEarth_global_monthly_average.csv in the data folder. The data contains the date, raw temperature anomalies, and the temperature anomalies relative to the 1850-1900 baseline.
Oceanic Niño Index (ONI)
El Niño (La Niña) is a phenomenon in the equatorial Pacific Ocean characterized by five consecutive 3-month running means of sea surface temperature (SST) anomalies in the Niño 3.4 region that are above (below) the threshold of +0.5°C (-0.5°C). To keep the data frequency consistent with the other datasets, we use the monthly SST anomalies directly instead of the 3-month running means.
The SST data is obtained from the Extended Reconstructed Sea Surface Temperature (ERSST) dataset, which is a global monthly analysis of SST data derived from the International Comprehensive Ocean–Atmosphere Dataset (ICOADS) (Huang et al. 2025a, 2025b). The ONI data is available in CSV format from the NOAA Climate Monitoring website. The code below downloads the data, processes it, and saves it in a CSV file.
# Download the ONI data
ourl = "https://www.cpc.ncep.noaa.gov/data/indices/sstoi.indices"
ofilename = "data/Nino_data.csv"
open(ofilename, "w") do io
write(io, HTTP.get(ourl).body)
end
lines = readlines(ofilename)
cleaned_lines = [join(split(strip(line)), ",") for line in lines]
# Write to file
write(ofilename, join(cleaned_lines, "\n"))
rawoni = CSV.read(ofilename, DataFrame; delim=',', header=1)
fechas = Date.(rawoni.YR, rawoni.MON, 1) # Create Date column from YR and MON
oni = DataFrame(Date=fechas, Anom=rawoni[!, :ANOM_3])
CSV.write(ofilename, oni)
first(oni, 5) # Show the first 5 rows of the ONI data
| Row | Date | Anom |
|---|---|---|
| | Date | Float64 |
| 1 | 1982-01-01 | 0.08 |
| 2 | 1982-02-01 | -0.2 |
| 3 | 1982-03-01 | -0.14 |
| 4 | 1982-04-01 | 0.02 |
| 5 | 1982-05-01 | 0.49 |
The ONI dataset is now saved as Nino_data.csv in the data folder. The data contains the date and the ONI anomalies.
Merge all datasets
The code below merges all the datasets into a single dataset. It uses the leftjoin function to merge the datasets on the Date column. The resulting dataset contains the temperature anomalies for each dataset, as well as the ONI data. The merged dataset is saved in a CSV file.
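To illustrate the join pattern on a small scale, the sketch below merges a toy source onto a complete monthly calendar. The table names and values are invented for illustration. Months that the source does not cover come out as `missing`, and the result is sorted afterwards because a left join is not guaranteed to preserve the row order of the left table.

```julia
using DataFrames, Dates

# Toy tables, invented for illustration: the source covers only two of the four months.
calendar = DataFrame(Date = collect(Date(2000, 1, 1):Month(1):Date(2000, 4, 1)))
source   = DataFrame(Date = [Date(2000, 3, 1), Date(2000, 2, 1)], Temp = [0.3, 0.2])

joined = leftjoin(calendar, source, on = :Date)
sort!(joined, :Date)                   # restore chronological order after the join
rename!(joined, :Temp => :Toy_Temp)    # prefix so a later join does not clash on :Temp
println(joined)
```

Renaming after each join matters because every source table carries a column called `Temp`; without the prefix, a second join would run into a duplicate column name.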
# Load the datasets
hadcrut = CSV.read(hfilename, DataFrame)
gistemp = CSV.read(gfilename, DataFrame)
noaa = CSV.read(nfilename, DataFrame)
berkeley = CSV.read(bfilename, DataFrame)
oni = CSV.read(ofilename, DataFrame)
# Dates
min_date = minimum([minimum(hadcrut.Date), minimum(gistemp.Date), minimum(noaa.Date), minimum(berkeley.Date), minimum(oni.Date)])
max_date = maximum([maximum(hadcrut.Date), maximum(gistemp.Date), maximum(noaa.Date), maximum(berkeley.Date), maximum(oni.Date)])
complete_dates = collect(min_date:Month(1):max_date)
compiled_data = DataFrame(Date=complete_dates)
# HadCRUT5
compiled_data = leftjoin(compiled_data, hadcrut, on = :Date)
rename!(compiled_data, :RawTemperature => :HadCRUT_RawTemperature)
rename!(compiled_data, :Temp => :HadCRUT_Temp)
sort!(compiled_data, :Date)
# GISTEMP
compiled_data = leftjoin(compiled_data, gistemp, on = :Date)
rename!(compiled_data, :RawTemp => :GISTEMP_RawTemperature)
rename!(compiled_data, :Temp => :GISTEMP_Temp)
sort!(compiled_data, :Date)
# NOAA
compiled_data = leftjoin(compiled_data, noaa, on = :Date)
rename!(compiled_data, :RawTemp => :NOAA_RawTemperature)
rename!(compiled_data, :Temp => :NOAA_Temp)
sort!(compiled_data, :Date)
# Berkeley Earth
compiled_data = leftjoin(compiled_data, berkeley, on = :Date)
rename!(compiled_data, :RawTemperature => :Berkeley_RawTemperature)
rename!(compiled_data, :Temp => :Berkeley_Temp)
sort!(compiled_data, :Date)
# ONI
compiled_data = leftjoin(compiled_data, oni, on = :Date)
rename!(compiled_data, :Anom => :ONI_Anomaly)
sort!(compiled_data, :Date)
# Save the compiled data
compiled_filename = "data/Compiled_Global_Temperature_Data.csv"
CSV.write(compiled_filename, compiled_data)
first(compiled_data, 5) # Show the first 5 rows of the compiled data
| Row | Date | HadCRUT_RawTemperature | HadCRUT_Temp | GISTEMP_RawTemperature | GISTEMP_Temp | NOAA_RawTemperature | NOAA_Temp | Berkeley_RawTemperature | Berkeley_Temp | ONI_Anomaly |
|---|---|---|---|---|---|---|---|---|---|---|
| | Date | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? |
| 1 | 1850-01-01 | -0.674564 | -0.31563 | missing | missing | -0.751369 | -0.279708 | -0.473 | -0.181233 | missing |
| 2 | 1850-02-01 | -0.333416 | 0.0255188 | missing | missing | -0.527868 | -0.0562065 | -0.681 | -0.389233 | missing |
| 3 | 1850-03-01 | -0.591323 | -0.232388 | missing | missing | -0.542508 | -0.0708465 | -0.427 | -0.135233 | missing |
| 4 | 1850-04-01 | -0.588721 | -0.229786 | missing | missing | -0.655912 | -0.184251 | -0.681 | -0.389233 | missing |
| 5 | 1850-05-01 | -0.508817 | -0.149882 | missing | missing | -0.586003 | -0.114342 | -0.39 | -0.0982333 | missing |
The compiled dataset is now saved as Compiled_Global_Temperature_Data.csv in the data folder. The data contains the date, raw temperature anomalies for each dataset, and the temperature anomalies relative to the 1850-1900 baseline. It also includes the ONI anomalies.
Plot the data
Loading the compiled data and setting plot aesthetics.
# Load the compiled data and plot packages
using Plots
compiled_data = CSV.read(compiled_filename, DataFrame)
# Set plot aesthetics
theme(:ggplot2)
default(
fontfamily = "Computer Modern",
tickfontsize = 10, legendfontsize = 10,
titlefontsize = 12,
xlabelfontsize = 10,
ylabelfontsize = 10,
titlefontfamily = "Computer Modern",
legendfontfamily = "Computer Modern",
tickfontfamily = "Computer Modern",
dpi = 500
)
# Extract the dates for x-axis ticks
# This will be used for the x-axis ticks in the plot
xls = compiled_data.Date;
Plotting the data.
# Plot the data one dataset at a time, for clarity
p = plot(compiled_data.Date, compiled_data.HadCRUT_Temp, label="HadCRUT5", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies", linewidth=0.5, markershape=:circle, markersize=1)
plot!(compiled_data.Date, compiled_data.GISTEMP_Temp, label="GISTEMP", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies" , linewidth=0.5, markershape=:diamond, markersize=1)
plot!(compiled_data.Date, compiled_data.NOAA_Temp, label="NOAAGlobalTemp", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies" , linewidth=0.5, markershape=:+, markersize=1)
plot!(compiled_data.Date, compiled_data.Berkeley_Temp, label="Berkeley Earth", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies", linewidth=0.5, markershape=:xcross, markersize=1)
plot!(legend=:topleft, xticks=(xls[1:180:end], Dates.format.(xls[1:180:end], "Y"))) # Set x-axis ticks every 180 months
display(p)
The plot shows the global temperature anomalies for each dataset. The HadCRUT5 dataset is shown in blue, GISTEMP in orange, NOAAGlobalTemp in green, and Berkeley Earth in purple. The x-axis represents the date (monthly), and the y-axis represents the temperature anomaly in degrees Celsius.
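The tick construction used in the plotting code above can be checked on its own. The sketch below applies the same stride and format string to a shortened monthly axis; the name `xls_demo` and the date range are assumptions for illustration.

```julia
using Dates

# A shortened monthly date axis, for illustration only.
xls_demo = collect(Date(1850, 1, 1):Month(1):Date(1900, 12, 1))

tick_dates  = xls_demo[1:180:end]              # every 180th month, i.e. every 15 years
tick_labels = Dates.format.(tick_dates, "Y")   # year-only labels
println(tick_labels)
```

Passing the pair `(tick_dates, tick_labels)` as `xticks` tells Plots where to place each tick and which text to show there.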
# Save the plot
savefig("data/Global_Temperature_Anomalies.pdf")
The plot is saved as the PDF file Global_Temperature_Anomalies.pdf in the data folder.
The last 30 years
Zooming in on the last 30 years of data.
# Zoom in on the last 30 years of data
compiled_data_zoomed = compiled_data[compiled_data.Date .>= Date(1995, 1, 1), :]
p_zoomed = plot(compiled_data_zoomed.Date, compiled_data_zoomed.HadCRUT_Temp, label="HadCRUT5", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies (Last 30 Years)", linewidth=0.5, markershape=:circle, markersize=1)
plot!(compiled_data_zoomed.Date, compiled_data_zoomed.GISTEMP_Temp, label="GISTEMP", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies (Last 30 Years)", linewidth=0.5, markershape=:diamond, markersize=1)
plot!(compiled_data_zoomed.Date, compiled_data_zoomed.NOAA_Temp, label="NOAAGlobalTemp", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies (Last 30 Years)", linewidth=0.5, markershape=:+, markersize=1)
plot!(compiled_data_zoomed.Date, compiled_data_zoomed.Berkeley_Temp, label="Berkeley Earth", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies (Last 30 Years)", linewidth=0.5, markershape=:xcross, markersize=1)
plot!(legend=:topleft, xticks=(compiled_data_zoomed.Date[1:60:end], Dates.format.(compiled_data_zoomed.Date[1:60:end], "Y")))
display(p_zoomed)
The zoomed-in plot shows the global temperature anomalies for each dataset over the last 30 years. The HadCRUT5 dataset is shown in blue, GISTEMP in orange, NOAAGlobalTemp in green, and Berkeley Earth in purple. The x-axis represents the date (monthly), and the y-axis represents the temperature anomaly in degrees Celsius.
# Save the zoomed plot
savefig("data/Global_Temperature_Anomalies_Last_30_Years.pdf")
The plot is saved as the PDF file Global_Temperature_Anomalies_Last_30_Years.pdf in the data folder.
Adding El Niño and La Niña events
To add the periods of El Niño and La Niña events to the plot, we will use the Oceanic Niño Index (ONI) anomalies. An El Niño event is defined as a period when the ONI anomaly is above +0.5°C for five consecutive 3-month running means, while a La Niña event is defined as a period when the ONI anomaly is below -0.5°C for five consecutive 3-month running means. Nonetheless, to keep the data frequency consistent, we will use the monthly ONI anomalies directly, as stored in the ONI dataset, and require five consecutive months beyond the threshold.
First, we need to classify the ONI anomalies.
# Classify ONI anomalies into El Niño, La Niña, and Neutral events
ONI_Anomaly = compiled_data_zoomed.ONI_Anomaly[.!ismissing.(compiled_data_zoomed.ONI_Anomaly)]
T = length(ONI_Anomaly)
indicator = zeros(Int, T)
for i in 1:T
    if ONI_Anomaly[i] >= 0.5
        indicator[i] = 1
    elseif ONI_Anomaly[i] <= -0.5
        indicator[i] = -1
    else
        indicator[i] = 0
    end
end
Then we can add the El Niño and La Niña events to the plot. The shaded areas will indicate the periods of El Niño (red) and La Niña (blue) events based on the ONI anomalies.
p_zoomed_oni = p_zoomed
i = 1
while i <= length(indicator)
    current_val = indicator[i]
    if current_val in (-1, 1)
        start_idx = i
        while i <= length(indicator) && indicator[i] == current_val
            i = i + 1
        end
        stop_idx = i - 1
        if (stop_idx <= start_idx) || (stop_idx - start_idx < 4)
            continue
        else
            vspan!(p_zoomed_oni, [compiled_data_zoomed.Date[start_idx], compiled_data_zoomed.Date[stop_idx]], color=current_val == 1 ? :red : :blue, alpha=0.1, label ="")
        end
    else
        i = i + 1
    end
end
display(p_zoomed_oni)
The plot now shows the global temperature anomalies for each dataset over the last 30 years, with shaded areas indicating El Niño (red) and La Niña (blue) events based on the ONI anomalies. The x-axis represents the date (monthly), and the y-axis represents the temperature anomaly in degrees Celsius.
# Save the plot with ONI events
savefig("data/Global_Temperature_Anomalies_Last_30_Years_ONI.pdf")
The plot is saved as the PDF file Global_Temperature_Anomalies_Last_30_Years_ONI.pdf in the data folder.
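The event detection in this section amounts to finding runs of at least five consecutive months on the same side of the threshold. A minimal sketch of that logic, separated from the plotting, is given below. The helper `find_events` and the anomaly values are hypothetical and serve only to show which runs would be shaded.

```julia
# Hypothetical helper (not part of the notebook): list runs of identical non-zero
# indicator values that last at least `minlen` consecutive entries.
function find_events(indicator; minlen = 5)
    events = Tuple{Int, Int, Int}[]          # (start index, stop index, value)
    i = 1
    n = length(indicator)
    while i <= n
        val = indicator[i]
        start_idx = i
        while i <= n && indicator[i] == val  # advance to the end of the current run
            i += 1
        end
        stop_idx = i - 1
        if val != 0 && stop_idx - start_idx + 1 >= minlen
            push!(events, (start_idx, stop_idx, val))
        end
    end
    return events
end

# Made-up anomalies: a 5-month warm spell, a 2-month cold blip, then a 6-month cold spell.
anoms = [0.6, 0.7, 0.9, 0.8, 0.5, 0.1, -0.6, -0.7, 0.0, -0.5, -0.8, -0.9, -1.0, -0.7, -0.6]
indicator = [a >= 0.5 ? 1 : a <= -0.5 ? -1 : 0 for a in anoms]
println(find_events(indicator))
```

The 2-month cold blip is dropped because it is shorter than five months, which mirrors the `stop_idx - start_idx < 4` check in the loop above.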
References
Citation
If you use any of the data or code in this notebook, please cite the original datasets and this notebook as follows:
@article{vera-valdés2024,
author = {Vera-Valdés, J. Eduardo and Kvist, Olivia},
title = {Breaching 1.5°C: Give Me the Odds},
journal = {arXiv},
date = {2024-12-17},
url = {https://arxiv.org/abs/2412.13855},
doi = {10.48550/arXiv.2412.13855}
}